GPT-o1, o3

How is OpenAI o1 built? (Detailed edition)

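I read Quiet-STaR, the paper said to underlie o1's core techniques. The method actually used has not been published, so what follows is speculation; these are my impressions.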

With scaling of both training data and model parameters starting to plateau, this is groundbreaking in that it points to a new direction to explore: scaling inference time.
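To make the idea of spending more compute at inference time concrete, here is a minimal toy sketch. It is not o1's method (which is unpublished) and not Quiet-STaR; it swaps in plain majority voting over repeated samples (self-consistency), one of the simplest ways to trade extra inference compute for accuracy. `noisy_solver`, its 40% success rate, and the answer `42` are all hypothetical stand-ins for a sampled model response.

```python
import random
from collections import Counter


def noisy_solver(rng, correct_answer, p_correct=0.4, n_wrong=5):
    """Hypothetical stand-in for one sampled reasoning chain.

    Returns the correct answer with probability p_correct, otherwise
    one of n_wrong distinct wrong answers chosen uniformly.
    """
    if rng.random() < p_correct:
        return correct_answer
    return correct_answer + rng.randint(1, n_wrong)


def majority_vote(rng, correct_answer, n_samples):
    """Sample n_samples answers and return the most frequent one."""
    answers = [noisy_solver(rng, correct_answer) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]


def accuracy(n_samples, trials=2000, seed=0):
    """Fraction of trials in which the voted answer is correct."""
    rng = random.Random(seed)
    hits = sum(majority_vote(rng, 42, n_samples) == 42 for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # More samples per question means more inference-time compute.
    for n in (1, 5, 25):
        print(f"samples={n:2d}  accuracy={accuracy(n):.3f}")
```

In this toy setting the model itself never changes; only the number of samples per question does, and accuracy rises with it because the single correct answer accumulates votes faster than any individual wrong answer.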

Learning to reason

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

O1 Replication Journey: A Strategic Progress Report – Part 1

Trying out o1 Pro for everything from product ideation to implementation!

A Small Step Towards Reproducing OpenAI o1: Progress Report on the Steiner Open Source Models

OpenAI, the large language model company, develops "test-time compute," a new AI training technique that breaks through the limits of conventional methods

OpenAI o1 System Card

December 25, 2024: Does o3 Dream of AGI? (Weekly AI)

Search-o1: Agentic Search-Enhanced Large Reasoning Models

OpenAI o3-mini System Card